The FSSA algorithm begins with the creation and visualization of functional time series (fts) data. The following is a univariate example where we create functional data objects using the ‘fda’ package from call center data.
data("Callcenter")
require(fda)
require(Rfssa)
## Define functional objects
D <- matrix(sqrt(Callcenter$calls),nrow = 240)
N <- ncol(D)
time <- seq(ISOdate(1999,1,1), ISOdate(1999,12,31), by="day")
K <- nrow(D)
u <- seq(0,K,length.out =K)
d <- 22 #Optimal Number of basis elements
basis <- create.bspline.basis(c(min(u),max(u)),d)
Ysmooth <- smooth.basis(u,D,basis)
At this point, we have created 365 curves where each curve is representative of the number of calls to the center on a given day, throughout that day. Next we convert the fd data object to an object of class fts.
## Define functional time series
Y <- fts(Ysmooth$fd,time = time)
In this version of the Rfssa package, we introduced plotting for fts objects that uses the ‘plotly’ package. The following shows examples of the different plotting options available to the user using our call center example.
Here’s a line plot showing the fts object where each curve is representative of calls received in a day, throughout the day.
plot(Y, main="Line Plot of Callcenter Data",xlab = "Intraday Intervals",ylab = "Sqrt of Callcenter", type='line')
This next plot is a heat map which offers a top-down view of the fts object so that the user can more clearly see patterns in the functions over time. We see that despite the day, the call center receives less calls early in the day and more calls later on in the day. We also see that there is periodicity present in the fts object as about every seven days, we have a change in function behavior. This is to be expected as the bank receives less calls on weekends.
plot(Y, main = "Heat Map of Callcenter Data",tlab="Days",xlab = "Intraday Intervals", type='heatmap')
The following 3D surface and 3D line plot options are helpful in the sense that it combines the information provided in the line plot with the information provided in the heat map plot. For our call center example, we again see the seven day periodic behavior.
plot(Y, xlab = "Intraday intervals",ylab = "Sqrt of Callcenter", type='3Dsurface')
plot(Y, xlab = "Intraday intervals",ylab = "Sqrt of Callcenter", type='3Dline')
In general, the addition of the fts class with the new plotting options for the object is extremely helpful and informative when analyzing functional time series data as seen in our univariate call center example. These plotting options give us clues into how we choose our lag parameter and what types of patterns we should be searching for in our FSSA plots. For instance, since we see a roughly seven day periodic component, we expect to see seven pop up in our functional singular spectrum analysis in some way such as in the paired-plots.
The FSSA algorithm also supports multivariate options. In addition to the update of the fts class and fts plotting options, we also allow the user to perform multivariate functional singular spectrum analysis (mfssa). So, if we are tracking more than one family of curves and they are correlated, we can strengthen our analysis by including that correlation in our analysis. In order to illustrate the use of the FSSA algorithm on mutlivariate fts objects, we give a bivariate example with the Jambi data set. Satellite images of the Jambi region in Indonesia were taken every 16 days over the course of about 20 years using NASA’s MODerate-resolution Imaging Spectroradiometer (MODIS). The goal was to track changes in vegetation in the Jambi region by using pixel values where a higher pixel value corresponds to more vegetation (greener areas) and a lower pixel value corresponds to less vegetation (less green areas). Two pixel indices were tracked known as the normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI) with globabl coverage at 250 meters squared. Our goal in this example is to use mfssa to uncover changes in these indices together as they are correlated in order to detect changes vegetation.
Since density functions are more informative than using average or maximum pixel values and we want to work with functional data, we start by converting each of our 448 images of pixel values into densities. This allows us to look at the probability density function of observing a pixel value between zero and one for both NDVI and EVI images for each of the 448 days that an image was taken. We estimate the density functions for both the NDVI and EVI data using the following
require(fda)
require(Rfssa)
## Raw image data
NDVI=Jambi$NDVI
EVI=Jambi$EVI
time <- Jambi$Date
## Kernel density estimation of pixel intensity
D0_NDVI <- matrix(NA,nrow = 512, ncol = 448)
D0_EVI <- matrix(NA,nrow =512, ncol = 448)
for(i in 1:448){
D0_NDVI[,i] <- density(NDVI[,,i],from=0,to=1)$y
D0_EVI[,i] <- density(EVI[,,i],from=0,to=1)$y
}
In this sense, we are forming two fd objects, one is the fd object whose elements are the densities of the NDVI pixel values and the other which is the fd object whose elements are the densities of the EVI pixel values. We form these objects with the following code.
## Define functional objects
d <- 11
basis <- create.bspline.basis(c(0,1),d)
u <- seq(0,1,length.out = 512)
y_NDVI <- smooth.basis(u,as.matrix(D0_NDVI),basis)$fd
y_EVI <- smooth.basis(u,as.matrix(D0_EVI),basis)$fd
y=list(y_NDVI,y_EVI)
At this point, we have two families of fd curves. One that corresponds to NDVI images and the other that corresponds to EVI images. These fd objects are really fts objects that were tracked over the course of 448 days. In this sense, we have a fts object of pixel densities for both the NDVI images and the EVI images. At this point, we convert the fd objects into a multivariate fts object using the following.
## Define functional time series
Y <- fts(y,time=time)
Just like in the univariate example of the call center data, we can visualize the multivariate fts object using line plots, heat maps, 3D surfaces, and 3D line plots that offer the same unique advantages described in the univariate example.
It is difficult to pick out any pattern in the mutlivariate fts object using the following line plot.
plot(Y,main="Density Estimations",xlab="Pixel Value",ylab=c("NDVI Pixel Density","EVI Pixel Density"), type="line")
We get a little more information out of the heat map plot for the multivariate fts object as there might be some seasonality at play with regards to the four seasons in a year. This is to be expected as the probability of observing a green pixel in the colder months will drop and will also rise in the warmer months. We should note here that the first heat map corresponds to the NDVI images while the second corresponds to the EVI images.
plot(Y,main="Density Estimations NDVI and EVI",tlab="Time",xlab="Pixel Value", type="heatmap")
The 3D surface and 3D line plots also show the seasons at play in influencing the probability of observing green pixels.
plot(Y, type = '3Dsurface', var=1,ylab = c("NDVI"),main = "Probability Kernel Density")
plot(Y, type = '3Dline', var=1,ylab = c("NDVI"),main = "Probability Kernel Density")
plot(Y, type = '3Dsurface', var=2,ylab = c("EVI"),main = "Probability Kernel Density")
plot(Y, type = '3Dline', var=2,ylab = c("EVI"),main = "Probability Kernel Density")
This part of the readme showcases how a user might go about beginning their analysis using FSSA and offers univariate and multivariate examples of how to visualize the fts objects. These plotting techniques are extremely useful in determining how the user should choose the lag parameter and can give insight as to what patterns the user should look for in their FSSA decomposition.